Novel low-overhead roll-forward recovery scheme for distributed systems

Authors

Bidyut Gupta

Shahram Rahimi

Ziping Liu

Abstract

An efficient roll-forward checkpointing/recovery scheme for distributed systems is presented. This work improves on our earlier work. The concept of forced checkpoints makes it possible to design a single-phase, non-blocking algorithm for finding consistent global checkpoints. The scheme offers the main advantages of both the synchronous and the asynchronous approaches: simple recovery and a simple way to create checkpoints. The algorithm produces a reduced number of checkpoints. Because each process decides independently whether to take a forced checkpoint, the algorithm is simple, fast and efficient. The proposed work offers better performance than some noted existing works. These advantages also allow the algorithm to work efficiently in a mobile computing environment.

Similar resources

Cache based fault recovery for distributed systems

No cache-based techniques for roll-forward fault recovery exist at present. A split-cache approach is proposed that provides efficient support for checkpointing and roll-forward fault recovery in distributed systems. This approach obviates the use of discrete stable storage or explicit synchronization among the processors. Stability of the checkpoint intervals is used as a driver for real time op...

Distributed Recovery Units: An Approach for Hybrid and Adaptive Distributed Recovery

Traditionally, distributed recovery schemes have been designed for systems consisting of multiple recovery units. Each recovery unit (RU) resides on a single processor and it can fail and recover as a whole. This report introduces the "distributed recovery unit (DRU)" abstraction as an approach for design of "hybrid" and "adaptive" recovery schemes for distributed systems. The distributed syste...

Another Two-Level Failure Recovery Scheme: Performance

This report deals with the design and evaluation of a "two-level" failure recovery scheme for distributed systems. In our previous work [30, 32], we motivated a "two-level" recovery approach that tolerates the more probable failures with a low overhead, and less probable failures with possibly higher overhead. The two-level approach can achieve a smaller overhead as compared to traditional recov...

Roll-Forward and Rollback Recovery: Performance-Reliability Trade-Off

Dhiraj K. Pradhan and Nitin H. Vaidya, Department of Computer Science, Texas A&M University, College Station, TX 77843-3112. Abstract: Performance and reliability achieved by a modular redundant system depend on the recovery scheme used. Typically, gain in performance using comparable resources results in reduced reliability. Several high-performance computers are not...

An Improved Logging and Checkpointing Scheme for Recoverable Distributed Shared Memory

The distributed shared memory (DSM) system transforms an existing network of workstations into a powerful shared-memory parallel computer that could deliver superior price/performance. However, with more workstations engaged in the system and longer execution time, the probability of faults increases, which could render the system useless. Several checkpointing and logging schemes have been propos...

Journal:

IET Computers & Digital Techniques

Volume 1, issue

Pages -

Publication date: 2007
